10 research outputs found

    On the Distribution of Control in Asynchronous Processor Architectures

    Institute for Computing Systems Architecture
    The effective performance of computer systems is to a large measure determined by the synergy between the processor architecture, the instruction set and the compiler. In the past, the sequencing of information within processor architectures has normally been synchronous: controlled centrally by a clock. However, this global signal could limit the future performance gains achievable through improvements in implementation technology. This thesis investigates the effects of relaxing this strict synchrony by distributing control within processor architectures through the use of a novel asynchronous design model known as a micronet. The impact of asynchronous control on the performance of a RISC-style processor is explored at different levels. Firstly, improvements in the performance of individual instructions by exploiting actual run-time behaviours are demonstrated. Secondly, it is shown that micronets are able to exploit further (both spatial and temporal) instruction-level parallelism (ILP) efficiently through the distribution of control to datapath resources. Finally, exposing fine-grain concurrency within a datapath can only be of benefit to a computer system if it can easily be exploited by the compiler. Although compilers for micronet-based asynchronous processors may be considered to be more complex than their synchronous counterparts, it is shown that the variable execution time of an instruction does not adversely affect the compiler's ability to schedule code efficiently. In conclusion, the modelling of a processor's datapath as a micronet permits the exploitation of both fine-grain ILP and actual run-time delays, thus leading to the efficient utilisation of functional units and in turn resulting in an improvement in overall system performance.

    Improving Memory Hierarchy Utilisation for Stencil Computations on Multicore Machines

    Although modern supercomputers are composed of multicore machines, some scientists still run legacy applications that were developed for single-core clusters, where the memory hierarchy is dedicated to a single core. The main objective of this paper is to propose and evaluate an algorithm that identifies an efficient block size for MPI stencil computations on multicore machines. Based on an extensive experimental analysis, this work shows the benefits of identifying block sizes that divide the data among the various cores, and suggests a methodology that exploits the memory hierarchy available in modern machines.
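    To illustrate what a block size means for a stencil computation, the following minimal Python sketch applies one 5-point Jacobi sweep with and without loop blocking. It is an assumption-laden illustration only: the function names, the 5-point stencil and the pure-Python grid are hypothetical choices, and the sketch does not reproduce the paper's block-size selection algorithm or its MPI decomposition.

```python
def jacobi_step(grid):
    """One unblocked 5-point Jacobi sweep over the interior of a 2D grid."""
    n, m = len(grid), len(grid[0])
    out = [row[:] for row in grid]  # boundary values are copied unchanged
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            out[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                + grid[i][j - 1] + grid[i][j + 1])
    return out


def jacobi_step_blocked(grid, block_rows, block_cols):
    """Same sweep, but the interior is visited tile by tile.

    block_rows x block_cols is the (hypothetical) block size; a tile that
    fits in cache lets neighbouring rows be reused before they are evicted.
    """
    n, m = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for bi in range(1, n - 1, block_rows):          # tile origin, rows
        for bj in range(1, m - 1, block_cols):      # tile origin, columns
            for i in range(bi, min(bi + block_rows, n - 1)):
                for j in range(bj, min(bj + block_cols, m - 1)):
                    out[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                        + grid[i][j - 1] + grid[i][j + 1])
    return out
```

    Both functions compute the same result for any block size; only the traversal order, and therefore the memory access pattern, changes. Choosing the block size well is the problem the abstract addresses.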

    OMWS: A Web Service Interface for Ecological Niche Modelling

    Ecological niche modelling (ENM) experiments often involve a high number of tasks to be performed. Such tasks may consume a significant amount of computing resources and take a long time to complete, especially when using personal computers. OMWS is a Web service interface that allows more powerful computing back-ends to be remotely exploited by other applications to carry out ENM tasks. Its latest version includes a new operation that can be used to specify complex workflows in a single request, adding the possibility of using workflow management systems on parallel computing back-ends. In this paper we describe the OMWS protocol and compare its most recent version with the previous one by running the same ENM experiment using two functionally equivalent clients, each designed for one of the OMWS interface versions. Different back-end configurations were used to investigate how the performance scales for each protocol version when more processing power is made available. Results show that the new version outperforms the previous one by a factor of 2 when more computing resources are used.

    The latest version of OMWS contains improvements coming from different sets of requirements originating from two projects that funded their corresponding implementation: EUBrazilOpenBio, with grants from the European Commission and the National Council for Scientific and Technological Development of Brazil (CNPq) of the Brazilian Ministry of Science and Technology (MCT), and BioVeL, with grants from the European Commission. Server infrastructure was operated through a provisioning system developed in the frame of the Spanish project CLUVIEM (TIN2013-44390-R) funded by the "Ministerio de Economía y Competitividad".

    Giovanni, RD.; Torres Serrano, E.; Amaral, RB.; Blanquer Espert, I.; Rebello, V.; Canhos, VP. (2015). OMWS: A Web Service Interface for Ecological Niche Modelling. Biodiversity Informatics. 10:35-44. https://doi.org/10.17161/bi.v10i0.4853

    Towards Optimal Static Task Scheduling for Realistic Machine Models: Theory and Practice

    Task scheduling is a key element in achieving high performance from multicomputer systems. Efficient scheduling algorithms reduce the interprocessor communication and improve processor utilization. To do so effectively, such algorithms must be based on a communication cost model appropriate for computing systems in use. The optimal scheduling of tasks is NP-hard, and a large number of heuristic algorithms have been proposed for a range of differing scheduling conditions (graph types, granularities and cost or architectural models). Unfortunately, due both to the variety of systems available and the rate at which these systems evolve, an appropriate representative cost model has yet to be established. In this paper we study the problem of task scheduling unde

    Towards an Effective Task Clustering Heuristic for LogP Machines

    This paper describes a task scheduling algorithm, based on a LogP-type model, for allocating arbitrary task graphs to fully connected networks of processors. This problem is known to be NP-complete even under the delay model (a special case of the LogP model). The strategy exploits the replication and clustering of tasks to minimise the ill effects of communication overhead on the makespan. The quality of the schedules produced by this LogP-based algorithm, initially under delay model conditions, is compared with that of other good delay model-based approaches.
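    As a rough illustration of why clustering affects the makespan under the delay model, the sketch below evaluates a given task-to-processor assignment for a small task graph. It is a minimal sketch under stated assumptions, not the paper's heuristic: the function `makespan`, its arguments and the example graph are hypothetical, task replication is not modelled, and the LogP overhead and gap parameters are ignored (only a per-edge latency is charged when two tasks sit on different processors).

```python
def makespan(costs, edges, cluster_of, order):
    """Makespan of a schedule under a simple delay model.

    costs:      {task: execution time}
    edges:      {(u, v): communication delay if u and v are on different processors}
    cluster_of: {task: processor id}  (tasks in one cluster share a processor)
    order:      tasks in a topological order; each processor runs its tasks in this order
    """
    preds = {t: [] for t in costs}
    for (u, v), delay in edges.items():
        preds[v].append((u, delay))

    finish = {}     # finish time of each scheduled task
    proc_free = {}  # time at which each processor becomes idle
    for t in order:
        ready = 0
        for u, delay in preds[t]:
            # Communication is free inside a cluster, delayed across clusters.
            same = cluster_of[u] == cluster_of[t]
            ready = max(ready, finish[u] + (0 if same else delay))
        start = max(ready, proc_free.get(cluster_of[t], 0))
        finish[t] = start + costs[t]
        proc_free[cluster_of[t]] = finish[t]
    return max(finish.values())


# Hypothetical fork-join graph: a -> b, a -> c, b -> d, c -> d.
costs = {"a": 2, "b": 2, "c": 2, "d": 2}
edges = {("a", "b"): 3, ("a", "c"): 3, ("b", "d"): 3, ("c", "d"): 3}
order = ["a", "b", "c", "d"]

one_cluster = {t: 0 for t in costs}               # everything on one processor
spread_out = {"a": 0, "b": 1, "c": 2, "d": 3}     # one task per processor

print(makespan(costs, edges, one_cluster, order))  # 8
print(makespan(costs, edges, spread_out, order))   # 12
```

    In this example the communication delay exceeds the task cost, so placing all tasks in one cluster gives a shorter makespan than spreading them out; a clustering heuristic searches for assignments that balance this trade-off.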

    using the UpRight Library

    Firstly, I thank God, who works in ways beyond our understanding, but makes all things possible. I am grateful for the blessings I have been given in life: curiosity, skill and faith. Curiosity, to never stop asking questions and seeking answers to them; skill, to solve the problems that I can solve on my own; faith, to deal with those that I can't. I am grateful for the blessing of a wonderful family, to which this thesis is dedicated. My parents, Santosh and Corinne, have always been supportive of the choices I have made in my life. When I was unsure whether pursuing a Master's degree halfway across the world from home was worth the cost and effort, they reassured me that it was. They were right. Without their love and support, I would not be where I am today. My sister, Sonia, and her husband, Vinod, deserve their share of credit: their support, advice and reassurance provided me the motivation I needed in the last few months of pushing hard to get this work done. I thank Lorenzo Alvisi, my advisor, for the opportunity to work o